Friending recommendation has successfully contributed to 
the explosive growth of on-line social networks. Most friend- 
ing recommendation services today aim to support passive 
friending, where a user passively selects friending targets 
from the recommended candidates. In this paper, we advo- 
cate recommendation support for active friending, where a 
user actively specifies a friending target. To the best of our 
knowledge, a recommendation designed to provide guidance 
for a user to systematically approach his friending target, 
has not been explored in existing on-line social networking 
services. To maximize the probability that the friending tar- 
get would accept an invitation from the user, we formulate 
a new optimization problem, namely, Acceptance Probabil- 
ity Maximization (APM), and develop a polynomial time 
algorithm, called Selective Invitation with Tree and In-Node 
Aggregation (SITINA), to find the optimal solution. We 
implement an active friending service with SITINA in Face- 
book to validate our idea. Our user study and experimental 
results show that SITINA efficiently outperforms manual selection 
and the baseline approach in solution quality. 
1. INTRODUCTION 

Due to the development and popularity of social networking 
services, such as Facebook, Google+, and LinkedIn, the 
new notion of "social network friending" has appeared in 
recent years. To boost the growth of their user bases, ex- 
isting social networking services usually provide friending 
recommendations to their users, encouraging them to send 
invitations to make more friends. Conventionally, friend- 
ing recommendations are made following a passive friend- 
ing strategy, i.e., a user passively selects candidates from 
the provided recommendation list to send the invitations. 
Moreover, the recommended candidates are usually friends-of-friends 
of the user, especially those who share many common 
friends with the user. This strategy is quite intuitive 
because friends-of-friends may have been acquaintances or 
friends offline. Furthermore, most users may feel more comfortable 
sending a friending invitation to friends-of-friends 
rather than to a total stranger with whom they share no social 
connections at all. It is envisaged that the success rate 
of such a passive friending strategy is high, contributing to 
the explosive growth of on-line social networking services. 

In contrast to passive friending, the idea of active 
friending, where a person takes proactive actions to 
make friends with another person, does exist in everyday 
life. For example, in a high school, a student fan may like 
to make friends with the captain of the school soccer team 
or with the lead singer of a rock-and-roll band at the school. 
A salesperson may be interested in getting acquainted with 
a high-value potential customer in the hope of making a business 
pitch. A young KDD researcher may wish to make 
friends with the leaders of the community in order to participate in 
conference organization and services. However, to the best 
of our knowledge, the idea of providing friending recommendations 
to assist and guide a user in effectively approaching 
another person for active friending has not been explored in 
existing on-line social networking services. We argue that 
social networking service providers, interested in exploring 
new revenue sources and further growing their user bases, may be 
interested in supporting active friending. 

One may argue that, in existing social networking services, 
an active friending initiator can send an invitation 
directly to the friending target anyway. However, it may 
not work if the initiator is regarded as a stranger by the 
target, especially when they are socially distant, i.e., they 
have no common friends. Therefore, to increase the chance 
that the target would accept the friending invitation, it may 
be a good idea for the initiator to first know some friends of 
the target, which in turn may require the initiator to know 
some friends of friends of the target. In other words, if the 
initiator would like to plan for some actions, he may need 
the topological information of the social network between 
the target and himself, which unfortunately is not available 
due to privacy concerns. Therefore, it would be very helpful if 
the social networking service providers, given a target specified 
by the initiator, could provide step-by-step guidance 
in the form of recommendations to assist the initiator in making 
friends towards the target. 

In this paper, we make a grand suggestion for 
social networking service providers to support active friending. 
Our sketch is as follows. By iteratively recommending 
a list of candidates who are friends of at least one existing 
friend of the initiator, a social networking service provider 
may support active friending, without violating the current 
practice of privacy preservation in recommendations. Con- 
sider an initiator who specifies a friending target. The so- 
cial networking service, based on its proprietary algorithms, 
recommends a set of friending candidates who may likely 
increase the chance for the target to accept the eventual 
invitation from the initiator. Similar to the recommenda- 
tions for passive friending, the recommendation list consists 
of only the friends of existing friends of the initiator. Sup- 
posedly, the initiator follows the recommendations to send 
invitations to candidates in the list. The invitation is dis- 
played to a candidate along with the list of common friends 
between the initiator and the candidate so as to encourage 
acceptance of the invitation. As such, the aforementioned 
step is repeated until the friending target appears in the rec- 
ommendation list and an invitation is sent by the initiator. 
Obviously, the recommendations made for passive friend- 
ing may not work well because active friending is target- 
oriented. The recommended candidates should be carefully 
chosen for the initiator, guiding him to approach the friend- 
ing target step-by-step. 

To support active friending, the key issue is on the de- 
sign of the algorithms that select the recommendation can- 
didates. A simple scheme is to provide recommendations by 
unveiling the shortest path between the initiator and the tar- 
get in the social network, i.e., recommending one candidate 
at each step along the path. As such, the initiator can grad- 
ually approach the target by acquainting the individuals on 
the path. However, this shortest-path recommendation approach 
may fail as soon as a middle-person does not accept 
the friending invitation (since only one candidate is included 
in the recommendation list for each step). To address this 
issue, it is desirable to recommend multiple candidates at 
each step, since the initiator is then more likely to share more 
common friends with the target and thereby more likely 
to get accepted by the target. In particular, by broadcasting 
the friending invitations to all neighbors of the initiator's 
friends, the probability of reaching the friending target and getting 
accepted can be effectively maximized, as an enormous number 
of paths are flooded with invitations to approach the target. 
Nevertheless, friending invitations are abused here because 
the above undirected broadcast is aimless and prone to 
involve many unnecessary neighbors. Moreover, the initiator 
may not want to handle a large number of tedious invitations. 

In this paper, we study a new optimization problem, called 
Acceptance Probability Maximization (APM), for active friending 
in on-line social networks. The service providers, who are 
eager to explore new monetary tools to increase revenue, 
may consider charging users for the active friending service. 
Given an initiator s, a friending target t, and the 
maximal number $r_R$ of invitations allowed to be issued by 
the initiator, APM finds a set R of $r_R$ nodes, such that s 
can sequentially send invitations to the nodes in R in order 
to approach t. The objective is to maximize the acceptance 

2 This is also a common practice for passive friending in existing 
social networking services such as Facebook, Google+, 
and LinkedIn. 

3 Recent news reported that Facebook now allows its users to 
pay to promote their own and their friends' posts. 



probability at t of the friending invitation when s sends it 
to t. The parameter $r_R$ controls the trade-off between the 
expected acceptance probability of t and the anticipated efforts 
made by s for active friending. Again, R is not 
returned to s as a whole due to privacy concerns. Instead, 
only a subset $R_s$ of nodes that are adjacent to the existing 
friends of s is recommended to s, while other subsets of R 
will be recommended to s as appropriate in later steps. 

To tackle the APM problem, we propose three algorithms: 
i) Range-based Greedy (RG) algorithm, ii) Selective Invita- 
tion with Tree Aggregation (SITA) algorithm, and iii) Selec- 
tive Invitation with Tree and In-Node Aggregation (SITINA) 
algorithm. RG selects candidates by taking into account 
their acceptance probability and the remaining budget of 
invitations, leading to the best recommendations for each 
step. However, the algorithm does not achieve the opti- 
mal acceptance probability of the invitation to a target due 
to the lack of coordinated friending efforts. On the other 
hand, aiming to systematically select the nodes for recom- 
mendation, SITA is designed by dynamic programming to 
find nodes which may result in a coordinated friending effort 
to increase the acceptance probability of the target. SITA is 
able to obtain the optimal solution, yet has an exponential 
time complexity. To address the efficiency issue, SITINA 
further refines the ideas in SITA by carefully aggregating 
some information gathered during processing to alleviate re- 
dundant computation in future steps and thus obtains the 
optimal solution for APM in polynomial time. The contri- 
butions of this paper are summarized as follows. 

• We advocate for the idea of active friending in on-line 
social networks and propose to support active friending 
through a series of recommendation lists which serve 
as step-by-step guidance for the initiator. 

• We formulate a new optimization problem, namely, 
Acceptance Probability Maximization (APM), for con- 
figuring the recommendation lists in the active friend- 
ing process. APM aims to maximize the acceptance 
probability of the invitation from the initiator to the 
friending target, by recommending selective interme- 
diate friends to approach the target. 

• We propose a number of new algorithms for APM. 
Among them, Selective Invitation with Tree and In-Node 
Aggregation (SITINA) derives the optimal solution 
for APM in $O(n_V r_R^2)$ time, where $n_V$ is the 
number of nodes in a social network, and $r_R$ is the 
number of invitations budgeted for APM. 

• We implement SITINA in Facebook in support of ac- 
tive friending and conduct a user study including 169 

4 Since s is not aware of the network topology and the distance 
to t, it is not reasonable to let s directly specify $r_R$. 
Instead, it is more promising for the service provider to list 
a set of $r_R$ values and the corresponding acceptance probabilities 
and monetary costs, so that the user can choose a proper $r_R$ 
according to her available budget. 

5 In this paper, APM is formulated as an offline optimization 
problem aiming to maximize the acceptance probability in 
expectation. In an on-line scenario where the initiator does 
not send invitations to some nodes in $R_s$ or some nodes in 
$R_s$ do not accept the invitations, a new APM with renewed 
invitation budget could be re-issued to obtain adapted rec- 
ommendations. While this scenario raises important issues, 
it is beyond the scope of this paper. 



volunteers with varied backgrounds. The user study 
and experimental results show that SITINA efficiently outperforms 
manual selection and the baseline approach 
in solution quality. 

The rest of this paper is organized as follows. Section 2 
introduces a model for invitation acceptance and formulates 
APM. Section 3 reviews the related work. Section 4 presents 
the SITA and SITINA algorithms proposed for APM. Section 5 
reports our user study and experimental results. Finally, 
Section 6 concludes the paper. 

2. INVITATION ACCEPTANCE 

The notion of acceptance probability is with respect to an 
invitation. Thus, here we first discuss two important fac- 
tors that may affect the acceptance probability of a friend- 
ing invitation in the environment of on-line social network- 
ing services and describe how in this work we determine 
whether an individual would accept a received invitation. 
Next, we explain why the issue of deriving the acceptance 
probability over a social network is very challenging and how 
we address this issue by adopting an approximate probabil- 
ity based on a maximum influence in-arborescence (MIIA) 
tree. We formulate the acceptance probability maximization 
(APM) problem based on the MIIA tree. The invitation ac- 
ceptance model follows the existing social influence and ho- 
mophily models, which have been justified in the literature. 
Later, in Section 5, the invitation acceptance model will be 
validated by a user study with 169 volunteers. 

2.1 Factors for Invitation Acceptance 

In the process of active friending, while friending candi- 
dates are recommended for the initiator to send invitations, 
whether the invitees will accept the invitations remains un- 
certain. Based on prior research in sociology and on-line 
social networks [12, 20, 21], we argue that when a person 
receives an invitation over an on-line social network, the 
decision of the invitee depends primarily on two important 
factors: i) the social influence factor [13, 14], and ii) the homophily 
factor [9, 12, 21]. Here, the social influence factor 
represents the influence from the surroundings (i.e., com- 
mon friends) of individuals in the social network on the de- 
cision. On the other hand, the homophily factor captures 
the fact that each individual in a social network has a dis- 
tinctive set of personal characteristics, and the similarities 
and compatibilities among characteristics of two individuals 
can strongly influence whether they will become friends [12] . 
Between them, social influence comes from established social 
links, while the homophily between two individuals may ex- 
ist without a prerequisite of established social relationship. 
Thus, we consider these two factors separately but aim to 
treat them in a uniform fashion in our derivation of the 
acceptance probability for an invitation. 

As the social influence factor involves the structure of so- 
cial network (i.e., the common friends of the individuals), 
we first consider the acceptance probability of an invitation 
in terms of social influence. Let the social network be 
represented as a social graph G(V, E), where V consists of 
all the users in the social networking system and E is the set of 
established social links among the users. An edge weight 
$w_{u,v} \in [0, 1]$ on the directed edge $(u, v) \in E$ probabilistically 

6 We intend to extend it with homophily factor later. 



denotes the social influence of u upon v. The probability can 
be derived with existing methods [13, 14] from 
the interactions in on-line social networks, while the 
setting of negative social influence has also been introduced 
in [3]. Thus, if u is associated with an invitation from a 
user s to v (i.e., u is a common friend of s and v), $w_{u,v}$ is 
the probability for v to be socially influenced by u to accept 
the invitation. Hence, the acceptance probability for 
an invitation can be derived by taking into account the social 
influences of all the existing common friends associated 
with the invitation. It is assumed that each common friend 
u has an independent social influence on the invitee v to accept 
the friending invitation [9, 12, 20], and thus the overall 
acceptance probability can be obtained by aggregating the 
individual social influences. Later, the user study in Section 5 
demonstrates that the influence probability and homophily 
probability derived according to the literature are consistent 
with the real probabilities measured from the users. 
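
As a minimal sketch of this aggregation step, the snippet below combines the independent influences of the common friends as one minus the probability that none of them succeeds, which is the form Definition 1 in Section 2.2 takes when every in-neighbor is already a friend of the initiator. The function name and the example weights are illustrative assumptions, not values from the user study.

```python
def single_invitation_acceptance(common_friend_weights):
    """Aggregate independent influences w_{u,v} of common friends u on invitee v.

    Assumption for illustration: every common friend is already a friend of
    the initiator, so only the edge weights matter.
    """
    miss = 1.0  # probability that no common friend influences the invitee
    for weight in common_friend_weights:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("influence weights must lie in [0, 1]")
        miss *= 1.0 - weight
    return 1.0 - miss


# Hypothetical weights for three common friends.
probability = single_invitation_acceptance([0.5, 0.2, 0.1])
```

With no common friend at all the function returns 0, which is the situation that motivates making intermediate friends first.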

While obtaining the acceptance probability for a given 
invitation (as described above) is simple, deriving the acceptance 
probability for a friending target t who does not 
have any common friend with the initiator s becomes very 
challenging because more than one invitation needs to be 
issued (so as to make some common friends first), and there 
are complicated correlations among user acceptance events 
for users between s and t. 

Moreover, our ultimate task is to find a set R of intermediate 
users between s and t with size at most $r_R$ for s to send 
invitations to, so as to maximize the acceptance probability 
of t. We call this problem the acceptance probability maximization 
(APM) problem. Due to the combinatorial nature 
of this invitation set R, it is still hard to find such a set to 
maximize the acceptance probability of t even in cases where 
computing the acceptance probability is easy. The following 
theorem makes these two hardness results precise. 

Theorem 1. Given the set of neighbors S of the initiator, 
computing the acceptance probability of t is #P-hard. 
Moreover, finding a set R with size $r_R$ that maximizes the 
acceptance probability of target t is NP-hard, even for cases 
when computing the acceptance probability is easy. 

Proof. We first prove that computing the acceptance 
probability of t with given R is #P-hard. Let $G_R$ denote 
the induced subgraph of G with s, t, and R. Let $G_R'$ denote 
a directed subgraph of $G_R$ obtained by removing every edge (u, v) in 
$G_R$ if u does not influence v to accept an invitation, either 
because u does not become a friend of s or v does not accept 
the invitation due to the social influence from u. Therefore, 
t will finally be a friend of s if there exists a path from s to t in $G_R'$, 
representing that every node in the path, including t, accepts 
the friend invitation from s. Apparently, if the probability 
of social influence associated with each edge is 0.5, the 
probability that t accepts the friend invitation is the number 
of subgraphs $G_R'$ with t accepting, divided by the number of 
possible subgraphs $G_R'$, which is $2^{n_E}$, where $n_E$ is the number 
of directed edges. In other words, after acquiring the acceptance 
probability, the number of subgraphs $G_R'$ with t accepting 
can be computed immediately by multiplying by $2^{n_E}$. 

7 The social influence probability has been extensively used 
to quantify the probability of success in the process of conformity, 
assimilation, and persuasion in Social Psychology [5, 
12, 20]. While how to obtain the edge weight is an active 
research topic [13, 24], it is out of the scope of this paper. 




Figure 1: An illustration graph of building the APM 
instance 

We prove that computing the acceptance probability of t 
is #P-hard with a reduction from a #P-complete problem, 
called the s-t connectedness problem [27], which finds the number 
of subgraphs in a directed graph $G_C$ with a directed path 
from s to t. We let $G_R = G_C$ and assign the probability of 
social influence of each edge as 0.5. With the observation 
in the previous paragraph, any algorithm that computes the 
acceptance probability of t also solves the s-t connectedness 
problem, because the number of subgraphs in $G_C$ with a 
directed path from s to t is simply the acceptance probability 
of t multiplied by $2^{n_E}$. 
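
The counting argument can be checked by hand on a tiny graph. The sketch below enumerates all $2^{n_E}$ edge subsets of a hypothetical four-edge graph, counts those that keep a directed path from s to t, and divides by $2^{n_E}$ to obtain the acceptance probability when every edge weight is 0.5. The graph and the function names are assumptions made for illustration; the enumeration is exponential and only meant to make the identity concrete.

```python
from itertools import product


def has_path(edges, source, target):
    """Depth-first search for a directed path from source to target."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def count_connected_subgraphs(edges, source, target):
    """Count edge subsets that still contain a source-to-target path."""
    count = 0
    for mask in product([False, True], repeat=len(edges)):
        kept = [edge for edge, keep in zip(edges, mask) if keep]
        if has_path(kept, source, target):
            count += 1
    return count


# Hypothetical graph with two edge-disjoint paths s -> a -> t and s -> b -> t.
edges = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
count = count_connected_subgraphs(edges, "s", "t")
probability = count / 2 ** len(edges)
```

Here 7 of the 16 subgraphs keep a path, so the probability is 7/16, which agrees with 1 - (1 - 0.25)^2 for two independent two-edge paths.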

Moreover, even for cases when computing the acceptance 
probability of t is easy, finding a set R that maximizes the 
acceptance probability of target t is NP-hard in the IC model. 
We prove it with a reduction from the set cover problem. For 
a bipartite graph (X, Y, E), the set cover problem aims to identify 
whether there exists a k-node subset $X_S$ of X covering 
all nodes in Y, i.e., for any $y \in Y$, there exists an $x \in X_S$ 
with $(x, y) \in E$. Let us denote $|Y| = z_y$. For an instance 
of the set cover problem, we build an instance for computing 
the acceptance probability of t as follows, as illustrated 
in Figure 1. 1) We add a node s and a 
directed edge (s, x) for each $x \in X$ with weight w(s, x) = 1. 
Notice that s is the only node with acceptance probability 
1 in the beginning. 2) We add a node t and a directed 
edge (y, t) for each $y \in Y$ with weight w(y, t) = p, 
$p \in (0, 1)$. 3) We set w(e) = 1 for each $e \in E$. We 
prove that there is a k-node subset $X_S \subseteq X$ covering all 
nodes in Y in the set cover problem if and only if there is 
a solution with acceptance probability $1 - (1 - p)^{z_y}$ when 
selecting $r_R = k + z_y + 1$ nodes in computing the acceptance 
probability. We first prove the sufficient condition. If there 
exists a k-node subset $X_S$ covering all nodes in Y, selecting 
$X_S \cup Y \cup \{t\}$ (totally $k + z_y + 1$ nodes) will obtain acceptance 
probability $1 - (1 - p)^{z_y}$. We then prove the necessary 
condition. If there is a solution R with $r_R$ nodes obtaining 
acceptance probability $1 - (1 - p)^{z_y}$, R must contain t and 
all nodes in Y, and the set $R \cap X$ (totally $r_R - z_y - 1 = k$ 
nodes) must cover all nodes in Y. Thus selecting a suitable 
R is NP-hard. The theorem follows. □ 

2.2 Approximate Acceptance Probability 

The spread maximization problem in the Independent Cascade 
(IC) model [17] also faces the challenge in Theorem 1. 

8 Notice that $1 - (1 - p)^{z_y}$ is the maximum probability when 
including all nodes in $X \cup Y \cup \{t\}$ into R, and thus it is obviously 
the maximum probability when selecting $r_R$ nodes. 
9 The spread maximization problem, which also adopts a 



To efficiently address this issue, an approximate IC model, 
called MIA [4, 5, 6], has been proposed. The social influence 
from a person u to another person v is effectively approximated 
by their maximum influence path (MIP), where 
the social influence $w_{u,v}$ on the path (u, v) is the maximum 
weight among all the possible paths from u to v. MIA creates 
a maximum influence in-arborescence, i.e., a directed tree, 
$MIIA(t, \theta)$ including the union of every MIP to t with the 
probability of social influence at least $\theta$ from a set S of leaf 
nodes. The MIA model has been widely adopted to describe 
the social influence in the literature [4, 5, 6] with the following 
definition of activation probability, which basically is the 
same as the acceptance probability if s broadcasts friending 
invitations to all nodes in $MIIA(t, \theta)$. 

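Definition 1. The activation probability of a node v in 
$MIIA(t, \theta)$ is 

$$
ap'(v, S, MIIA(t, \theta)) =
\begin{cases}
1, & \text{if } v \in S \\
0, & \text{if } N^{in}(v) = \emptyset \\
1 - \prod_{u \in N^{in}(v)} \left(1 - ap'(u, S, MIIA(t, \theta)) \cdot w_{u,v}\right), & \text{otherwise}
\end{cases}
$$

where $N^{in}(v)$ is the set of in-neighbors of v. 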

Note that $ap'(u, S, MIIA(t, \theta)) \cdot w_{u,v}$ is the joint probability 
that u is activated and successfully influences v, and 
u can never influence v if it is not activated. Therefore, the 
activation probability of a node v can be derived according 
to the activation probability of all its in-neighbors, i.e., 
child nodes in the tree. Since S is the set of leaf nodes, the 
activation probabilities of all nodes in $MIIA(t, \theta)$ can be 
derived in a bottom-up manner from S toward t efficiently. 
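
The bottom-up evaluation of Definition 1 can be sketched as follows. The tree is assumed to be given as a mapping from each node to its in-neighbors (children) plus a mapping from directed edges to weights; these data structures, the function name, and the toy tree are illustrative assumptions rather than the authors' implementation.

```python
def activation_probabilities(root, seeds, in_neighbors, weights):
    """Return ap'(v) for every node reachable from root, per Definition 1.

    in_neighbors maps a node to its children in the in-arborescence;
    weights maps a directed edge (u, v) to w_{u,v}.
    """
    # Pre-order walk from the root; reversing it visits children before parents.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        if node not in seeds:
            stack.extend(in_neighbors.get(node, []))

    ap = {}
    for node in reversed(order):
        children = in_neighbors.get(node, [])
        if node in seeds:
            ap[node] = 1.0
        elif not children:
            ap[node] = 0.0
        else:
            miss = 1.0  # probability that no child activates this node
            for child in children:
                miss *= 1.0 - ap[child] * weights[(child, node)]
            ap[node] = 1.0 - miss
    return ap


# Hypothetical tree: seeds a and b feed v1; v1 and seed c feed the root t.
in_neighbors = {"t": ["v1", "c"], "v1": ["a", "b"]}
weights = {("a", "v1"): 0.5, ("b", "v1"): 0.5, ("v1", "t"): 0.8, ("c", "t"): 0.25}
ap = activation_probabilities("t", {"a", "b", "c"}, in_neighbors, weights)
```

On this toy tree ap'(v1) is 0.75 and ap'(t) is 0.7. Each edge is visited once, so the pass is linear in the size of the tree.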

In light of the similarity between the IC model and the 
decision model for invitation acceptance in active friending 
with no budget limitation on invitations, we also exploit MIA 
to tackle the APM problem. $MIIA(t, \theta)$ is constructed from 
the MIPs from all friends of s to t, i.e., S is the set of friends 
of s. In other words, $\theta$ is set as 0 to ensure that the social 
influence from every friend is fully incorporated. Nevertheless, 
different from the activation probability in the literature, 
which allows the influence to propagate via every node 
in $MIIA(t, \theta)$, the acceptance probability for active friending 
allows the social influence to take effect on invitation 
acceptance only via a set R of nodes to be selected in our 
problem. Thus, we define the acceptance probability for an 
invitation to node v as follows. 

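Definition 2. The acceptance probability for an invitation 
of a node v in $MIIA(t, \theta)$ is 

$$
ap(v, S, R, MIIA(t, \theta)) =
\begin{cases}
1, & \text{if } v \in S \\
0, & \text{if } v \notin R \text{ or } N^{in}(v) = \emptyset \\
1 - \prod_{u \in N^{in}(v)} \left(1 - ap(u, S, R, MIIA(t, \theta)) \cdot w_{u,v}\right), & \text{otherwise}
\end{cases}
$$

where $N^{in}(v)$ is the set of in-neighbors of v. 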
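
To make the role of R concrete, here is a small recursive sketch of Definition 2. It assumes the same hypothetical tree encoding as before (a children mapping and an edge-weight mapping); the parameter name `invited` stands for R, and all names and numbers are illustrative.

```python
def acceptance_probability(node, seeds, invited, in_neighbors, weights):
    """Evaluate ap(node, S, R, MIIA) from Definition 2 by recursion on the tree."""
    if node in seeds:
        return 1.0
    children = in_neighbors.get(node, [])
    if node not in invited or not children:
        return 0.0
    miss = 1.0  # probability that no in-neighbor brings this node to accept
    for child in children:
        ap_child = acceptance_probability(child, seeds, invited, in_neighbors, weights)
        miss *= 1.0 - ap_child * weights[(child, node)]
    return 1.0 - miss


# Hypothetical tree: seeds a and b feed v1; v1 and seed c feed the target t.
seeds = {"a", "b", "c"}
in_neighbors = {"t": ["v1", "c"], "v1": ["a", "b"]}
weights = {("a", "v1"): 0.5, ("b", "v1"): 0.5, ("v1", "t"): 0.8, ("c", "t"): 0.25}

with_v1 = acceptance_probability("t", seeds, {"v1", "t"}, in_neighbors, weights)
without_v1 = acceptance_probability("t", seeds, {"t"}, in_neighbors, weights)
target_not_invited = acceptance_probability("t", seeds, {"v1"}, in_neighbors, weights)
```

Inviting both v1 and t gives 0.7, inviting only t gives 0.25 because v1 contributes nothing when it is outside R, and leaving t out of R gives 0.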

Equipped with MIA, we are able to derive the acceptance 
probability of t efficiently with a simple iterative approach 
from the leaf nodes to the root (i.e., t). The above MIA ar- 
borescence incorporates only the social influence factor. As 

probabilistic influence model, is different from APM in this 
paper. Given an initiator s and his friends, APM intends 
to discover an effective subgraph (i.e., R) between the seeds 
and t. On the other hand, the spread maximization prob- 
lem, given the topology of the whole social network, aims 
to find a given number of seeds to maximize the size of the 
whole spread t. 




Figure 2: Combining the social influence and ho- 
mophily factors 

discussed earlier, the homophily factor between the initiator 
and the receiver of an invitation is also crucial for friending. 
Homophily in [9, 12, 21] represents the probability for two 
individuals u and v to create a new social link due to shared 
common personal characteristics. Homophily in sociology 
describes the general tendency of people to associate with 
similar others, and it can be quantified with various approaches 
[2, 16, 30]. The homophily probability can be set 
according to [3]. 

To extend MIA, we attach a duplicated s to each node 
with a directed edge, with a parameter specifying the homophily 
factor from s to that node. The MIP from each candidate 
to t, together with the directed edge from s to the candidate, 
is incorporated in the extended MIA. Therefore, the 
extended MIA is also an arborescence, where each leaf node 
is a friend of s or s herself, and those leaf nodes make up 
the set S. 

Figure 2 shows an example of the extended MIA. For each 
internal node, such as $v_1$, its acceptance probability factors 
in not only the social influence from $v_3$ and $v_4$ but also the 
homophily factor between s and $v_1$. 

In this paper, the influence probability and homophily 
probability are derived according to the above literature 
without associating them with different weights. Later, a user 
study will be presented in Section 5, and the results show 
that the real acceptance probability complies with the 
acceptance probability of the above model. 
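
One way to picture the extended MIA is to add, for every node with a homophily factor, an extra leaf that represents a copy of s and belongs to S. The sketch below does this on a hypothetical tree and then evaluates Definition 2 unchanged. The tuple label used for the copy of s, the homophily values, and the helper names are assumptions made for illustration.

```python
def extend_with_homophily(in_neighbors, weights, seeds, homophily):
    """Attach a copy of s to every node listed in homophily (node -> factor)."""
    ext_in = {node: list(children) for node, children in in_neighbors.items()}
    ext_weights = dict(weights)
    ext_seeds = set(seeds)
    for node, factor in homophily.items():
        s_copy = ("s", node)  # hypothetical label for the duplicated initiator
        ext_in.setdefault(node, []).append(s_copy)
        ext_weights[(s_copy, node)] = factor
        ext_seeds.add(s_copy)
    return ext_in, ext_weights, ext_seeds


def acceptance_probability(node, seeds, invited, in_neighbors, weights):
    """Definition 2 evaluated recursively on the (extended) tree."""
    if node in seeds:
        return 1.0
    children = in_neighbors.get(node, [])
    if node not in invited or not children:
        return 0.0
    miss = 1.0
    for child in children:
        ap_child = acceptance_probability(child, seeds, invited, in_neighbors, weights)
        miss *= 1.0 - ap_child * weights[(child, node)]
    return 1.0 - miss


# Hypothetical tree and homophily factors between s and each internal node.
in_neighbors = {"t": ["v1", "c"], "v1": ["a", "b"]}
weights = {("a", "v1"): 0.5, ("b", "v1"): 0.5, ("v1", "t"): 0.8, ("c", "t"): 0.25}
ext_in, ext_w, ext_seeds = extend_with_homophily(
    in_neighbors, weights, {"a", "b", "c"}, {"v1": 0.2, "t": 0.1}
)
ap_v1 = acceptance_probability("v1", ext_seeds, {"v1", "t"}, ext_in, ext_w)
ap_t = acceptance_probability("t", ext_seeds, {"v1", "t"}, ext_in, ext_w)
```

Because the copy of s is just another leaf in S, social influence and homophily are combined by the same product, with no separate weighting between the two factors.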

2.3 Problem Formulation 

In this work, we formulate an optimization problem, called 
Acceptance Probability Maximization (APM), to select a given 
number of intermediate people to systematically approach 
the friending target t based on $MIIA(t, \theta)$. The APM problem 
is formally defined as follows. 

Acceptance Probability Maximization (APM). Given 
a social network G(V, E), an initiator s and a friending target 
t, select a set R of $r_R$ users for s to send friending invitations 
such that the acceptance probability 
$ap(t, S, R, MIIA(t, \theta))$ is maximized, where S is the set of friends 
of s, including s itself. 

As analyzed later, the optimal solution to APM can be 
obtained in $O(n_V r_R^2)$ time, where $n_V$ is the number of 
nodes in a social network, and $r_R$ is the total number of 
invitations allowed. The setting of $r_R$ has been discussed in 
Section 1. It is worth noting that APM maximizes the acceptance 
probability of t, instead of minimizing the number 

10 MIA was proposed to simplify the IC model, which is computation 
intensive and not scalable. Nevertheless, we prove 
that APM in the IC model is NP-hard in Theorem 1 and not 
submodular by displaying a counter example in the Appendix. 



of iterations to approach t, which can be achieved by the 
shortest path routing in an on-line social network. Nevertheless, 
it is possible to extend APM by limiting the number 
of edges in an MIP of $MIIA(t, \theta)$, to avoid incurring an 
unacceptable number of iterations in active friending. 
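
For very small trees, the APM objective can be checked by exhaustive search. The sketch below tries every set R of exactly $r_R$ non-seed nodes and keeps the one with the highest acceptance probability at t. It is an exponential-time illustration of the problem statement only, not SITA or SITINA, and the tree, weights, and function names are assumptions.

```python
from itertools import combinations


def acceptance_probability(node, seeds, invited, in_neighbors, weights):
    """Definition 2 evaluated recursively on the tree."""
    if node in seeds:
        return 1.0
    children = in_neighbors.get(node, [])
    if node not in invited or not children:
        return 0.0
    miss = 1.0
    for child in children:
        ap_child = acceptance_probability(child, seeds, invited, in_neighbors, weights)
        miss *= 1.0 - ap_child * weights[(child, node)]
    return 1.0 - miss


def brute_force_apm(target, seeds, in_neighbors, weights, budget):
    """Try every invitation set of size budget and return the best one."""
    nodes = set(in_neighbors)
    for children in in_neighbors.values():
        nodes.update(children)
    candidates = sorted(nodes - set(seeds))
    best_set, best_ap = None, -1.0
    for subset in combinations(candidates, budget):
        ap = acceptance_probability(target, seeds, set(subset), in_neighbors, weights)
        if ap > best_ap:
            best_set, best_ap = set(subset), ap
    return best_set, best_ap


# Hypothetical tree: a, b -> v1 -> t and c -> v2 -> t, with seeds a, b, c.
seeds = {"a", "b", "c"}
in_neighbors = {"t": ["v1", "v2"], "v1": ["a", "b"], "v2": ["c"]}
weights = {
    ("a", "v1"): 0.5, ("b", "v1"): 0.5, ("v1", "t"): 0.8,
    ("c", "v2"): 0.9, ("v2", "t"): 0.5,
}
best_two, ap_two = brute_force_apm("t", seeds, in_neighbors, weights, 2)
best_three, ap_three = brute_force_apm("t", seeds, in_neighbors, weights, 3)
```

With a budget of two the best set on this toy tree is {v1, t} with probability 0.6, and with a budget of three all of v1, v2, and t are invited for 0.78.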

3. RELATED WORK 

Recommendation for passive friending has been explored 
in the past few years. Chen et al. [3] show that friending 
recommendations based on the topology of an on-line social 
network are the easiest way leading to the acceptance of an 
invitation. In contrast, recommendations based on contents 
posted by users are very powerful for discovering potential 
new friends with similar interests [3]. Meanwhile, research 
shows that preference extracted from social networking ap- 
plications can be exploited for recommendations [15]. To 
avoid recommending socially distant candidates, users are 
allowed to specify different social constraints [23], e.g., the 
distance between a user and the recommended friending tar- 
gets, to limit the scope of friending recommendation. More- 
over, community information has been explored for recom- 
mendation [25] . Notice that the aforementioned research 
work and ideas are proposed for passive friending, where 
the friending targets are determined by the recommendation 
engines of social networking service providers in accordance 
with various criteria (e.g., preference and social closeness). 
Thus, the user can conveniently (but passively) send 
an invitation to targets on the recommendation list. Com- 
plementary to the conventional passive friending paradigm, 
in this paper, we propose the notion of active friending where 
a friending target can be specified by the initiator. Accord- 
ingly, the recommendation service may assist and guide the 
initiator to actively approach a target. 

The impact of social influence has been demonstrated in 
various applications, such as viral marketing [6, 17, 18] and 
interest inference [29]. Given an on-line social network, a 
major research problem is the seed selection problem, where 
the seeds correspond to the leaf nodes of MIA (i.e., initia- 
tor s and her friends) in our problem. In contrast, APM 
is to select the topology between the friends and t, instead 
of selecting the seeds. The homophily factor, capturing the 
tendency of users to connect with similar ones, has been considered 
in several applications, such as identifying trusted 
users [20] and user relationships [31] in social networks. 

Notice that some works develop algorithms to return a 
subgraph or path, such as community detection [15], shortest 
path [10], pattern matching [11], or graph isomorphism 
query [8]. In contrast to the shortest path query, our algorithms 
for the APM problem emphasize returning 
a graph, instead of a path. The topology of the returned 
graph contains valuable neighborhood information of some 
common friends who can be leveraged to effectively increase 
the acceptance probability of a friending invitation. The initiator 
of a pattern matching or a graph isomorphism query 
needs to specify a subgraph as the query input. In contrast, 
this paper aims at finding an unknown graph between s and 
t to maximize the acceptance probability of an invitation to 
a friending target. 

4. ALGORITHM DESIGN 

To tackle the APM problem, we aim to design efficient al- 
gorithms in support of the invitation recommendations for 



active friending. From our earlier discussions, it is easy to 
observe that the set of intermediate nodes in R, i.e., those to 
be recommended for invitation, plays a crucial role in maximizing 
the acceptance probability for active friending. Here 
we first introduce a range-based greedy algorithm which provides 
some good insights for our other algorithms. 

The algorithm, given an invitation budget $r_R$, aims to find
the set of invitation candidates for recommendations to an
initiator s who would like to make friends with a target t.
Let R denote the answer set, which is initialized as empty
at the beginning. The algorithm iteratively selects a node
v from the neighbors of s's current friends and adds it to
R based on two heuristics: 1) the highest acceptance
probability and 2) the number of remaining invitations. The
former aims to minimize the potential waste of a friending
invitation, while the latter avoids selecting a node too far
away to reach t by requiring that v be at most
$r_R - |R| - 1$ hops away from t. As a result, the range-based
greedy algorithm is inclined to first expand the friend
territory of s and then approach the neighborhood of t
aggressively.
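
The selection loop described above can be sketched as follows. This is a
minimal sketch under stated assumptions, not the authors' implementation:
the function name `range_greedy`, the dictionary `in_nbrs` (node -> list of
(in-neighbor, influence weight) pairs), the rule that t becomes eligible
only in the final step, and the way a candidate's acceptance probability is
combined from its already-accepted in-neighbors (the same product form that
Eq. (1) uses) are all illustrative choices. The toy tree borrows the weights
around $u_4$, $u_6$ and $u_8$ from the running example; the friend named `a`
and the two weights on the edges into `t` are hypothetical.

```python
def range_greedy(in_nbrs, friends, target, budget):
    """Greedy sketch: pick up to `budget` nodes; return (R, ap of target)."""
    # In the tree every node has exactly one out-neighbor (its parent).
    parent = {u: v for v, nbrs in in_nbrs.items() for u, _ in nbrs}

    def hops_to_target(v):
        d = 0
        while v != target:
            v, d = parent[v], d + 1
        return d

    ap = {f: 1.0 for f in friends}       # existing friends count as accepted
    selected = []
    nodes = set(parent) | {target}
    while len(selected) < budget:
        limit = budget - len(selected) - 1        # r_R - |R| - 1
        best, best_p = None, -1.0
        for v in sorted(nodes - set(friends) - set(selected)):
            if v == target and limit > 0:         # assumption: t is taken last
                continue
            reached = [(u, w) for u, w in in_nbrs.get(v, []) if u in ap]
            if not reached or hops_to_target(v) > limit:
                continue
            miss = 1.0
            for u, w in reached:
                miss *= 1.0 - ap[u] * w
            if 1.0 - miss > best_p:
                best, best_p = v, 1.0 - miss
        if best is None:                          # nothing reachable in range
            break
        selected.append(best)
        ap[best] = best_p
    return selected, ap.get(target, 0.0)


toy = {
    "t": [("u4", 0.5), ("u6", 0.6)],     # hypothetical weights
    "u4": [("u5", 0.75)],
    "u6": [("u7", 0.8), ("u8", 0.7)],
    "u8": [("a", 0.95)],                 # "a" is a hypothetical friend of s
}
picked, prob = range_greedy(toy, {"u5", "u7", "a"}, "t", budget=3)
print(picked, round(prob, 4))
```

On this toy tree the sketch picks u8, then u6, then t, because each step
takes the locally most likely acceptance without weighing how the remaining
invitations could be combined.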

4.1 Selective Invitation with Tree Aggregation 

While the range-based greedy algorithm is intuitive, the
nodes added to R at separate iterations are not selected in
a coordinated fashion. Thus, it is difficult for the range-based
greedy algorithm to effectively maximize the acceptance
probability. To address this issue, we propose a dynamic
programming algorithm, called Selective Invitation with
Tree Aggregation (SITA), that finds the optimal solution for
APM by exploring the maximum influence in-arborescence
tree rooted at t (i.e., MIIA(t, θ)) in a bottom-up fashion.
SITA starts from the leaf nodes, i.e., nodes without
in-neighbors, to explore MIIA(t, θ) in a topological order
until t is finally reached. In order to obtain the optimal
solution, SITA needs to explore various allocations of the $r_R$
invitations to different nodes close to s or t in MIIA(t, θ).
However, it is not necessary for SITA to enumerate all
possible invitation allocations. Thanks to the tree structure
of MIIA(t, θ), for each node v, SITA systematically
summarizes the best allocation for v, i.e., the one that generates
the highest acceptance probability for v, corresponding to the
subtree rooted at v. The summaries will be exploited later
by v's parent node, i.e., the only out-neighbor of v, to
identify the allocation generating the highest probability. The
above procedure is repeated iteratively until t is processed,
and the allocation of $r_R$ invitations to the subtree rooted at
t is the solution returned by SITA.

More specifically, let $f_{v,r}$ denote the maximum acceptance
probability for v to accept the invitation from s while r
invitations have been sent to the subtree rooted at v in
MIIA(t, θ). By first sorting all nodes in topological order to
t, we process $f_{v,r}$ of a node v after all $f_{u,r}$ of its
in-neighbors u have been processed. Apparently, $f_{v,0} = 0$ for
every node v that is not a friend of s because no invitation will
be sent to the subtree rooted at v. In contrast, for every leaf
node v, which is a friend of s (or s itself), $f_{v,r} = 1$ for
$r = 0$. For all other nodes v in MIIA(t, θ), SITA derives
$f_{v,r}$ according to each in-neighbor $f_{u_i,r_i}$ as follows,

$$f_{v,r} = \max_{\sum_i r_i = r-1} \Big\{ 1 - \prod_{u_i \in N^{in}(v)} \big[ 1 - f_{u_i,r_i} \cdot w_{u_i,v} \big] \Big\}, \qquad (1)$$

where $N^{in}(v)$ denotes the set of in-neighbors of v with
$|N^{in}(v)| = d_v$, $u_i$ is an in-neighbor of v, and $r_i$ is the
number of invitations sent from s to the subtree rooted at $u_i$.
An invitation is sent to v, while the remaining $r - 1$ invitations
are distributed to the in-neighbors of v. SITA effectively avoids
examining all possible distributions of the $r - 1$ invitations
to the nodes in the subtree. Instead, Eq. (1) examines only
$f_{u_i,r_i}$ of each in-neighbor $u_i$ of v on every possible number
of invitations $r_i$. In other words, only the in-neighbors of v,
instead of all nodes in the subtree, participate in the
computation of $f_{v,r}$ to efficiently reduce the computation involved.
For each node v, $f_{v,r}$ is derived in ascending order of r until
reaching $r = \min(r_R, z_v)$, where $z_v$ is the number of nodes
that are not friends of s in the subtree rooted at v. SITA
stops after $f_{t,r_R}$ is obtained. In the following, we show that
SITA finds the optimal solution to APM.$^{11}$

Figure 3: The running example (not including s and her edges)

Lemma 1. Algorithm SITA returns the optimal solution
to APM.

Proof. We prove the lemma by contradiction. Assume
that the solution from SITA, i.e., $f_{t,r_R}$, is not optimal.
According to the recurrence, there must exist at least one
in-neighbor $t_1 \in N^{in}(t)$ together with the number of invitations
$r_1$ such that $f_{t_1,r_1}$ is not optimal. Similarly, since
$f_{t_1,r_1}$ is not optimal, there exists at least one in-neighbor
$t_2 \in N^{in}(t_1)$ of $t_1$ with the number of invitations $r_2$ such
that $f_{t_2,r_2}$ is non-optimal, $r_2 < r_1$. Here in the proof, let
$f_{t_i,r_i}$ denote the non-optimal solution found in the i-th
iteration of the above backtracking process, which will continue
and eventually end with a probability $f_{t_i,r_i}$ such that 1) $t_i$
is a friend of s but $f_{t_i,0} \neq 1$ or $f_{t_i,r_i} = 0$ for
$r_i > 0$, or 2) $t_i$ is not a friend of s but $f_{t_i,0} \neq 0$.
The above two cases contradict the initial
assignment of SITA. The lemma follows. □
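
The recurrence in Eq. (1) can be sketched directly by enumerating every
allocation of the $r - 1$ remaining invitations, which is what makes SITA
exponential. This is a minimal sketch, assuming the tree is given as a
dictionary `in_nbrs` (node -> list of (in-neighbor, weight) pairs) and that
friends of s are leaves; the names `subtree_size` and `sita_f`, the friend
named `a`, and the weights on the edges into `t` are hypothetical. The
weights around $u_4$, $u_6$, $u_7$ and $u_8$ follow the running example.

```python
from itertools import product
from math import prod


def subtree_size(v, in_nbrs, friends):
    """z_v: number of non-friends of s in the subtree rooted at v."""
    own = 0 if v in friends else 1
    return own + sum(subtree_size(u, in_nbrs, friends)
                     for u, _ in in_nbrs.get(v, []))


def sita_f(v, r, in_nbrs, friends):
    """f_{v,r} from Eq. (1), trying every allocation with sum r - 1."""
    if v in friends:
        return 1.0        # leaf friend of s; only r = 0 reaches here (z_v = 0)
    if r == 0:
        return 0.0        # non-friend that receives no invitation
    nbrs = in_nbrs.get(v, [])
    # r_i may not exceed the number of non-friends in u_i's subtree.
    caps = [min(r - 1, subtree_size(u, in_nbrs, friends)) for u, _ in nbrs]
    best = 0.0
    for alloc in product(*(range(c + 1) for c in caps)):
        if sum(alloc) != r - 1:
            continue
        p = 1.0 - prod(1.0 - sita_f(u, ri, in_nbrs, friends) * w
                       for (u, w), ri in zip(nbrs, alloc))
        best = max(best, p)
    return best


toy = {
    "t": [("u4", 0.5), ("u6", 0.6)],     # hypothetical weights
    "u4": [("u5", 0.75)],
    "u6": [("u7", 0.8), ("u8", 0.7)],
    "u8": [("a", 0.95)],                 # "a" is a hypothetical friend of s
}
friends = {"u5", "u7", "a"}
print(round(sita_f("u6", 2, toy, friends), 4))   # f_{u_6,2} of the example
print(round(sita_f("t", 3, toy, friends), 4))
```

The first printed value reproduces $f_{u_6,2} = 0.933$ from the running
example; the second depends on the hypothetical weights into `t`.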

Example. Figure 3 illustrates an example of MIIA(t, θ)
with $r_R = 7$, where the nodes denote the users involved in
deriving the maximal acceptance probability for t and the
numbers labeled on edges denote the influence probability
between two nodes. Without loss of generality, s and her
homophily edges are not shown in Figure 3. Note that the
dark nodes at the leaf are s and her existing friends and thus
have the acceptance probability as 1, while the white nodes
are the recommendation candidates to be returned by SITA
along with their acceptance probabilities. SITA explores
MIIA(t, θ) from the dark leaf nodes in a topological order,
i.e., the $f_{v,r}$ of a node v is derived after all $f_{u,r}$ of its
in-neighbors u are processed. Take $u_4$ as an example.
$f_{u_4,0} = 0$ since no invitation is sent and
$f_{u_4,1} = 1 - (1 - f_{u_5,0} \cdot 0.75) = 0.75$.
Similarly, for $u_8$, $f_{u_8,0} = 0$ and $f_{u_8,1} = 0.95$. Consider
$u_6$, which has in-neighbors $u_7$ and $u_8$: $f_{u_6,0} = 0$,
$f_{u_6,1} = 0.8$, and
$f_{u_6,2} = 1 - (1 - f_{u_7,0} \cdot 0.8)(1 - f_{u_8,1} \cdot 0.7) = 0.933$.
Notice that for a node v, $f_{v,r}$ is derived for
$r \in [0, \min(z_v, r_R)]$, e.g., for $u_6$, we only derive
$r \in [0, 2]$. Nevertheless, to find $f_{v,r}$, SITA needs to try
different allocations by distributing the number of invitations
$r_i$ to each different neighbor $u_i$ and then combining the
solutions $f_{u_i,r_i}$ to acquire $f_{v,r}$. For example, to derive
$f_{u_{17},5}$, it is necessary to distribute 4 invitations to its
in-neighbors, including $u_{18}$, $u_{20}$ and $u_{24}$.
The possible allocations for $(r_{18}, r_{20}, r_{24})$ include (0,1,3),
(0,2,2), (1,0,3), (1,1,2) and (1,2,1),$^{12}$ which obtain
acceptance probabilities 0.5738, 0.7539, 0.7081, 0.6674 and
0.7639, respectively. Eventually, we obtain $f_{u_{17},5} = 0.7639$.
Notice that the number of possible allocations grows
exponentially. After all the nodes are processed, we obtain
$f_{t,7} = 0.7483$. On the other hand, the greedy algorithm RG
selects a user $v \notin R$ with the highest acceptance probability
that is at most $r_R - |R| - 1$ hops away from t. Accordingly,
it selects $u_8$, $u_6$, $u_{15}$, $u_{12}$, and $u_3$ sequentially. In
the 6th step, the node with the highest probability is $u_{10}$.
However, $u_{10}$ is 3 hops away with $3 > r_R - |R| - 1 = 2$ and is
thus not selected. Instead, RG selects the node with the next highest
acceptance probability, i.e., $u_1$. In the last step, only the
root t can be selected, so RG obtains a solution with the
acceptance probability as 0.4013. As shown, SITA outperforms
RG.

11 Due to the space constraint, we do not show the pseudocode
of SITA here but refer the readers to the next section
where a more general SITINA is presented.

4.2 Selective Invitation with Tree and In-Node 
Aggregation 

Unfortunately, SITA is not a polynomial-time algorithm
because in Eq. (1), $O(r_R^{d_v})$ allocations are examined to
distribute $r_R - 1$ invitations to the subtrees of the $d_v$
in-neighbors corresponding to each node v. To remedy this
scalability issue, we propose Selective Invitation with Tree
and In-Node Aggregation (SITINA) to answer APM in
polynomial time. SITINA effectively avoids processing
$O(r_R^{d_v})$ allocations by iteratively finding the best allocation
for the first k in-neighbors, which in turn is then exploited to
identify the best allocation for the first k + 1 in-neighbors. The
process iterates from k = 1 till k = $d_v$. Consequently, the
possible allocations for distributing $r_R - 1$ invitations to all
in-neighbors are returned by Eq. (1) in $O(d_v r_R)$ time, where
$d_v$ is the in-degree of v in MIIA(t, θ).

To efficiently derive $f_{v,r}$ in Eq. (1), we number the
in-neighbors of v as $u_1, u_2, \ldots, u_{d_v}$, where $d_v$ is the
in-degree of v. Let $m_{v,k,x}$ denote the maximum acceptance
probability by sending x invitations to the subtrees of the first
k neighbors of v, i.e., $u_1$ to $u_k$. Initially,
$m_{v,1,x} = f_{u_1,x} w_{u_1,v}$, $x \in [0, r_R]$. SITINA derives
$m_{v,k,x}$ according to the best result of the first k - 1
in-neighbors,

$$m_{v,k,x} = \max_{0 \le x' \le x} \big\{ 1 - [1 - m_{v,k-1,x-x'}][1 - f_{u_k,x'} w_{u_k,v}] \big\}, \qquad (2)$$

where $f_{u_k,x'} w_{u_k,v}$ is the acceptance probability for
allocating x' invitations to the k-th in-neighbor $u_k$, and
$m_{v,k-1,x-x'}$ is the best solution for allocating x - x'
invitations to the first k - 1 in-neighbors. By carefully examining
different x', we can obtain the best solution $m_{v,k,x}$ for a
given k.

Table 1: All $m_{u_{17},k,x}$

k    x = 1    x = 2    x = 3    x = 4    x = 5    x = 6
1    0.315    *        *        *        *        *
2    0.315    0.4931   0.6528   *        *        *
3    0.32     0.5342   0.6674   0.7639   0.8314   0.8520

12 Some allocations are eliminated since
$r_i \notin [0, \min(r_R, z_{u_i})]$.

SITINA starts from k = 1 to k = $d_v$. For each k, SITINA
begins with x = 0 until $x = \min(\sum_{i \in [1,k]} z_{u_i}, r_R - 1)$,
where $\sum_{i \in [1,k]} z_{u_i}$ is the total number of nodes that are
not friends of s in the subtrees of the first k in-neighbors. SITINA
stops after finding every $m_{v,d_v,x}$, $x \in [0, \min(z_v, r_R - 1)]$.
The pseudocode is presented in Algorithm 1, and the following
lemma indicates that the optimal solution of APM is
$m_{t,d_t,r_R-1}$.

Lemma 2. For any v and r, $f_{v,r} = m_{v,d_v,r-1}$.

Proof. We prove the lemma by contradiction. Assume
that $m_{v,d_v,r-1}$ is not optimal. According to the recurrence,
there exists at least one $r_{d_v}$ such that
$m_{v,d_v-1,(r-1-r_{d_v})}$ is not optimal. Similarly, since
$m_{v,d_v-1,(r-1-r_{d_v})}$ is not optimal, there exists at least
one $r_{d_v-1}$ such that $m_{v,d_v-2,(r-1-r_{d_v}-r_{d_v-1})}$ is not
optimal. Therefore, let $m_{v,d_v-i,(r-1-\sigma_i)}$, where
$\sigma_i = \sum_{j \in [0,i-1]} r_{d_v-j}$, denote the
non-optimal solution obtained in the i-th iteration. The
backtracking process continues and eventually ends with
$i = d_v - 1$, where $m_{v,1,x} \neq f_{u_1,x} w_{u_1,v}$. It contradicts
the initial assignment of $m_{v,1,x}$, and the lemma follows. □
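
The in-node aggregation of Eq. (2) can be sketched as a knapsack-style merge
over the in-neighbors of each node. This is a minimal sketch under stated
assumptions rather than the authors' code: the tree is a dictionary
`in_nbrs` (node -> list of (in-neighbor, weight) pairs), friends of s are
leaves, and the function name `sitina_table`, the friend named `a`, and the
weights on the edges into `t` are hypothetical. The list `m` plays the role
of $m_{v,k,x}$ for the in-neighbors merged so far.

```python
def sitina_table(v, budget, in_nbrs, friends):
    """Return [f_{v,0}, ..., f_{v,r}] with r = min(budget, z_v), via Eq. (2)."""
    if v in friends:
        return [1.0]                 # a friend of s: only r = 0 applies
    if budget <= 0:
        return [0.0]                 # non-friend that gets no invitation
    m = [0.0]                        # m_{v,0,x}: no in-neighbor merged yet
    for u, w in in_nbrs.get(v, []):
        fu = sitina_table(u, budget - 1, in_nbrs, friends)
        top = min(budget - 1, (len(m) - 1) + (len(fu) - 1))
        merged = [0.0] * (top + 1)
        for x in range(top + 1):
            for xp in range(min(x, len(fu) - 1) + 1):   # x' sent below u
                if x - xp < len(m):
                    cand = 1.0 - (1.0 - m[x - xp]) * (1.0 - fu[xp] * w)
                    merged[x] = max(merged[x], cand)
        m = merged
    return [0.0] + m                 # f_{v,0} = 0 and f_{v,x+1} = m_{v,d_v,x}


toy = {
    "t": [("u4", 0.5), ("u6", 0.6)],     # hypothetical weights
    "u4": [("u5", 0.75)],
    "u6": [("u7", 0.8), ("u8", 0.7)],
    "u8": [("a", 0.95)],                 # "a" is a hypothetical friend of s
}
friends = {"u5", "u7", "a"}
print([round(p, 4) for p in sitina_table("u6", 7, toy, friends)])
print([round(p, 4) for p in sitina_table("t", 7, toy, friends)])
```

Each node is visited once and each merge scans pairs (x, x'), so the work
per in-neighbor is quadratic in the budget instead of exponential in the
in-degree. The first printed list contains the values 0, 0.8 and 0.933 that
the running example reports for $u_6$.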

The following theorem proves that the algorithm finds
the optimal solution to APM in $O(n_V r_R^2)$ time, where $n_V$
is the number of nodes in a social network, and $r_R$ is the
number of invitations in APM. Note that any algorithm for
APM takes $\Omega(n_V)$ time because reading MIIA(t, θ) as the
input graph requires $\Omega(n_V)$ time. Therefore, SITINA is very
efficient, especially in a large social network with $n_V$
significantly larger than $d_{max}$ and $r_R$.

Theorem 2. Algorithm SITINA finds the optimal solution
to APM in $O(n_V r_R^2)$ time.

Proof. According to Lemma 1 and Lemma 2, SITINA
obtains the optimal solution of APM. Recall that $n_V$ is the
number of nodes in the social network, and $d_v$ is the
in-degree of a node v in MIIA(t, θ). The algorithm contains
$O(n_V)$ iterations. Each iteration examines a node v to find
$m_{v,d_v,x}$ for every $x \in [0, \min(z_v - 1, r_R - 1)]$, where
$r_R$ is the number of invitations sent by s in APM. There are
$O(d_v r_R)$ cases to be considered to explore all $m_{v,d_v,x}$
for v in Eq. (2), and each case requires $O(r_R)$ time. Therefore,
finding $m_{v,d_v,x}$ for a node v needs $O(d_v r_R^2)$ time, and the
time for all nodes in MIIA(t, θ) is $O(\sum_v d_v r_R^2)$, where
$\sum_v d_v = |E|$. As MIIA(t, θ) is a tree (i.e., $|E| = n_V - 1$),
the overall time complexity is $O(n_V r_R^2)$. The theorem follows. □
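
The gap between enumerating allocations as in Eq. (1) and aggregating them
as in Eq. (2) can be illustrated by counting the cases each one touches at a
single node. This is a rough sketch: it ignores the subtree caps $z_{u_i}$,
so both counts are upper bounds, and the function names are hypothetical.

```python
from math import comb


def sita_allocations(d, r):
    """Ways to split r - 1 invitations among d in-neighbors (stars and bars)."""
    return comb(r - 1 + d - 1, d - 1)


def sitina_cases(d, r):
    """(k, x, x') triples scanned by Eq. (2): d * sum over x in [0, r-1] of (x + 1)."""
    return d * r * (r + 1) // 2


for d, r in [(3, 5), (10, 20)]:
    print(d, r, sita_allocations(d, r), sitina_cases(d, r))
```

For a small in-degree the two counts are comparable, but for an in-degree of
10 and a budget of 20 the enumeration already runs into millions of
allocations while the aggregation scans a few thousand cases.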

Example. In the following, we illustrate how SITINA derives
$f_{u_{17},r}$, $r \in [0, z_{u_{17}}]$. At the beginning, the
in-neighbors of $u_{17}$ are ordered as $u_{18}$, $u_{20}$ and
$u_{24}$.$^{13}$ Then, we find all
$m_{u_{17},1,x} = f_{u_{18},x} \cdot w_{u_{18},u_{17}}$,
$x \in [0, \min(z_{u_{18}}, r_R - 1)]$ first,
representing the maximum acceptance probability of $u_{17}$
obtained by only sending x invitations to the subtree rooted
at the first in-neighbor, i.e., $u_{18}$. Then we derive
$m_{u_{17},2,x}$ for $x \in [0, \min(z_{u_{18}} + z_{u_{20}}, r_R - 1)]$
to acquire the maximum acceptance probability of $u_{17}$ by sending
invitations to subtrees rooted at $u_{18}$ and $u_{20}$. Notice that
different x', representing the invitations distributed to the k-th
subtree, need to be examined in order to find the optimal
solution. For instance, while deriving $m_{u_{17},3,4}$, we compare
$1 - (1 - m_{u_{17},2,1})(1 - f_{u_{24},3} \times w_{u_{24},u_{17}}) = 0.7081$,
$1 - (1 - m_{u_{17},2,2})(1 - f_{u_{24},2} \times w_{u_{24},u_{17}}) = 0.7539$ and
$1 - (1 - m_{u_{17},2,3})(1 - f_{u_{24},1} \times w_{u_{24},u_{17}}) = 0.7417$
and obtain $m_{u_{17},3,4} = 0.7539$. After deriving all
$f_{u_{17},x+1} = m_{u_{17},3,x}$ for $x \in [0, 6]$
($\min(r_R - 1, z_{u_{17}} - 1) = 6$), the computation of
$u_{17}$ finishes. Table 1 lists the detailed results, where *
denotes the instances with x exceeding the number of people
who are not the friends of s in the first k subtrees.$^{14}$

13 To avoid confusion, we keep their IDs in this example without
renaming them as $u_1^{u_{17}}$, $u_2^{u_{17}}$, and $u_3^{u_{17}}$.

Algorithm 1 Selective Invitation with Tree and In-Node
Aggregation (SITINA)

Require: The query issuer s; the targeted user t; the influence
    tree MIIA(t, θ) rooted at t; the number of requests
    $r_R$ that s can send.
Ensure: A set R of selected users that s sends requests to,
    such that the acceptance probability is maximized.
Obtain a topological order $\sigma$ which orders a node without
    in-neighbor first
for $v \in \sigma$ do
    // obtain all $f_{v,r}$, $r \in [0, \min(r_R, n_v)]$
    Order in-neighbors of v as $u_1^v, u_2^v, \ldots, u_{d_v}^v$
    $m_{v,0,r} \leftarrow 0$ for all $r \in [0, \min(r_R - 1, n_v - 1)]$
    for k = 1 to $d_v$ do
        for r = 1 to $\min(r_R - 1, n_v - 1)$ do
            x = r - 1
            $m_{v,k,x} = 0$
            for x' = 0 to x do
                if $m_{v,k,x} < 1 - [1 - m_{v,k-1,x-x'}][1 - f_{u_k,x'} w_{u_k,v}]$ then
                    $m_{v,k,x} \leftarrow 1 - [1 - m_{v,k-1,x-x'}][1 - f_{u_k,x'} w_{u_k,v}]$
                    $\pi_{v,k,x} \leftarrow x'$
    $f_{v,0} \leftarrow 0$
    $f_{v,x+1} \leftarrow m_{v,d_v,x}$, for all $x \in [0, \min(r_R - 1, n_v - 1)]$
Backtrack $\pi_{v,k,x}$ to obtain R
return R with maximized $f_{t,r_R}$



5. PERFORMANCE EVALUATION 

We implement active friending in Facebook and conduct 
a user study and a comprehensive set of experiments to val- 
idate our idea of active friending and to evaluate the perfor- 
mance of the proposed algorithms. In the following, we first 
detail the methodology of our evaluation and then present 
the results of our user study and experiments, respectively. 

5.1 Methodology 

We adopt a user study and experiments, two complementary
approaches, for the performance evaluation. We use the user
study to investigate how the recommendation-based active
friending approach fares against the approach
based on the users' own strategies (i.e., those they would
follow under the existing environment of social networking
services). To perform the user study, we implement an app
on Facebook. Through the app, the user is able to decide
whom to invite based on their own strategies to approach the
target. Meanwhile, according to the recommendations
generated from the Range-Based Greedy (RG) algorithm and
the Selective Invitation with Tree and In-Node Aggregation
(SITINA) algorithm, respectively, the user also sends
alternative sets of invitations to carry out the active friending
activities for comparison.$^{15}$ Note that Selective Invitation
with Tree Aggregation (SITA) is not considered because it
makes exactly the same recommendations as SITINA. We
recruited 169 volunteers to participate in the user study. Each
volunteer is given 25 targets with varied invitation budgets
to work on. The social distances between the volunteer and
the targets are pre-determined in order to collect results for
comparison under controlled parameter settings.

On the other hand, we conduct experiments by simula- 
tion to evaluate the solution quality and efficiency of SITA, 
SITINA, and RG, implemented in an HP DL580 server with 
four Intel Xeon E7-4870 2.4 GHz CPUs and 128 GB RAM. 
Two large real datasets, FacebookData and FlickrData are 
used in the experiments. FacebookData contains 60,290 users 
and 1,545,686 friend links crawled from Facebook [28], and 
FlickrData contains 1,846,198 users and 22,613,981 friend 
links crawled from Flickr [22]. The initiator s and target t
are selected uniformly at random. 

An important issue faced in both of our user study and 
the experiments is the social influence and homophily fac- 
tors captured in the social network, which are required for 
RG, SITA and SITINA to make recommendations. Most of 
previous works adopt a fixed probability (e.g., 1/degree in 
[171 H3 SI E]) or randomly choose a probability from a set 
of values (e.g., 0.001, 0.01, 0.1 in [6] g]) due to the lack of
real social influence probabilities and homophily probabili- 
ties. To address this issue, in the user study, we obtain the 
social influence probability on each edge by mining the
interaction history of volunteers in Facebook in accordance with
[13], [14]. We also derive the homophily probabilities from
s to other nodes by mining the profile information in their
Facebook pages based on [3]. The social network in the user
study is denoted as UserStudyData. As for the social
networks in FacebookData and FlickrData that are to be used
for experiments, we unfortunately do not have personal pro- 
files and historical interactions of the nodes. Thus, we could 
not generate the social influence probability and homophily 
probability by mining real data. As a result, we choose to 
assign the link weights of the social network based on: i) the 
distributions of social influence and homophily probabilities 
obtained from our user study (denoted as US), and ii) the 
Zipf distribution for its ability to capture many phenomena 
studied in the physical and social sciences [32].

5.2 User Study 

Through the user study, we have logged the responses of 
participants to invitations and thus are able to calculate the 
acceptance probabilities corresponding to invitations under 
various circumstances. Using the collected data, we make a 
number of comparisons. 



14 Note that $m_{u_{17},k,x} = 0$ when x = 0.

15 To alleviate the burden of the participants, we send
invitations on their behalf to the recommended candidates.

Figure 5: Acceptance probability in user study



First, we would like to verify that the acceptance
probability of an active friending plan derived based on the MIIA
tree (using the mined social influence and homophily
probabilities as the link weights) is consistent with that of the
plan being executed in the user study. Towards this goal, we
first verify the accuracy of our invitation acceptance model
(for a single invitation) by comparing the derived acceptance
probability and the actual acceptance probability obtained
from real activities in the user study. Figure 4(a), where
results obtained from the user study and our model are
respectively labeled as Actual and Derived, plots the comparison in
terms of the number of common friends in an invitation. As
shown, the acceptance probabilities of both Actual and Derived
increase as the number of common friends in invitations
increases. Most importantly, the results are consistently close,
showing that our invitation acceptance model (and the social
influence and homophily weights used) is able to reasonably
capture the decision making upon invitations in real life.

Notice that the above comparison focuses on the aspect
of invitation acceptance only, without taking into account
the social network topology, which we approximate with the
MIIA tree. To verify that using the MIIA tree is sufficiently
effective for active friending planning, we further compare
the acceptance probability derived using our proposed
algorithms and the actual acceptance probability obtained
through executing the plan in the user study. Figure 4(b)
shows that, under various distances between initiator and
target, the acceptance probabilities derived using the MIIA tree
are reasonably close to the actual acceptance probabilities.

Next, we compare the effectiveness of strategies based on
RG, SITINA and the participants' own heuristics.
Figure 5(a) plots the comparison by varying the number of
friending invitations, $r_R$. RG and SITINA generally
outperform user heuristics (labeled as User) under all settings.
We can observe that the performance of SITINA is generally
very good and gets better as $r_R$ increases, while the
performance of User and RG has a leap from $r_R = 5$ to 10 and
remains close afterwards. This indicates that the extra
computation effort required for deriving recommendations due to
the increased invitation budget is worthwhile, outperforming
the heuristic strategies derived based on RG and human
intuition. Figure 5(b) evaluates the acceptance probability
of t under varied settings of $d_{s,t}$. When $d_{s,t}$ is 2, it is more
likely to have a lot of common friends (due to the nature of
social networks) and thus to get better acceptance
probabilities. When $d_{s,t}$ increases, it becomes more difficult for
an initiator to make effective decisions due to the smaller
number of common friends and the lack of knowledge about the
larger and more complex social network topology behind.
As shown in Figure 5, SITINA has the best performance.

5.3 Experimental Results

While the user study verifies that SITINA is able to achieve
the best performance, the size of the social network is small
due to the limited number of volunteers participating in the
study. To further validate our ideas in a large-scale social
network and to evaluate the scalability of SITINA, we
conduct an experimental study by simulations.

5.3.1 Scalability 

As proved earlier, SITA can obtain the optimal solution of
APM. However, it is not scalable as it needs to examine all
combinations of invitation allocations. Here we use it as a
baseline to compare the efficacy and efficiency with SITINA
over social networks of different sizes. First, we compare
the results by randomly sampling 50 (initiator, target) pairs
using UserStudyData. As Figure 6 depicts, both SITA and
SITINA significantly outperform RG in terms of acceptance
probability. Next, we compare their running time, not only
using UserStudyData but also the large-scale FacebookData
and FlickrData. As shown in Figure 7, the SITA algorithm
takes more than 7 days without returning the answer and
is thus not feasible for practical use. For the rest of the
experiments, we only compare SITINA with RG.

5.3.2 Sensitivity Tests 

In this section, we conduct a series of sensitivity tests to
examine the impact of different parameters, including the
invitation budget ($r_R$), the number of friends of s (N), the
distance between the initiator and target ($d_{s,t}$), and the
skewness of social influence and homophily probabilities (α). In
experiments on the impacts of $r_R$, N, and $d_{s,t}$, we have tested
both the FacebookData (US) and FlickrData (US).$^{16}$ As the
observations on both datasets are quite similar, we only
report both results for the first experiment and skip the
FlickrData result for the rest due to the space constraint. Finally,
in the last experiment, we use FacebookData (ZF) to observe
how α may potentially impact our algorithms.

Impact of $r_R$. By varying $r_R$ and setting the default $d_{s,t}$
of sampled (s, t) pairs as 4, we compare SITINA and RG in
terms of the acceptance probability and the number of
iterations using FacebookData (US) and FlickrData (US) (see
Figure 8 and Figure 9, respectively). As shown in Figures
8(a) and 9(a), SITINA exhibits much better performance
than RG, regardless of $r_R$. Meanwhile, Figures 8(b) and
9(b) show that the longest path in the solution R
obtained by SITINA is shorter than that in RG because RG
tends to spend invitations on some local users with higher
acceptance probabilities.$^{17}$

16 US and ZF denote the link weights assigned based on models
from the user study and the Zipfian distribution, respectively.

Figure 8: Varying $r_R$ (FacebookData (US))
Figure 9: Varying $r_R$ (FlickrData (US))

Impact of N. We are interested in finding whether the
number of friends of s has an impact on the performance.
Thus, we choose four different groups of initiators s (who
have around 100, 200, 300, and 400 friends, respectively)
and sample 100 different targets t to compare their
acceptance probabilities. With $r_R$ set as 25, Figure 10 shows
that as the number of friends increases, the initiators have
more choices to reach their targets. As shown, SITINA can
find the optimal solution with high acceptance probability,
while the near-sighted RG tends to select the friends of friends
with higher acceptance probabilities and eventually ends up
with a small acceptance probability for t.

Impact of $d_{s,t}$. We also conduct an experiment to
understand the impact of $d_{s,t}$ on the performance. Not
surprisingly, the finding is consistent with our user study (please
refer to Section 5.2 and Figure 5(b)). Thus, we do not plot
the result here due to the space constraint.

Impact of α. In the experiments above, social influences
and homophily factors are modeled based on our user study,
but the distributions in different social networks may vary.
Thus, we use the Zipf distribution with the skewness
parameter α to examine the impact of α on our algorithms.
As shown in Figure 11, we can observe that as the
distributions of social influence and homophily become more skewed
(i.e., α increases), the acceptance probabilities of SITINA
and RG drop, because it becomes more difficult for
invitations to get accepted when there are fewer highly
influential links while the number of less influential links
increases. It is also worth noting that, as the line in the figure
indicates, the percentage of performance difference between
SITINA and RG (i.e., acceptance prob. of SITINA divided
by that of RG) increases, showing that SITINA is able to
handle skewed distributions much better than RG.

17 RG is inclined to take more time to reach t because
invitations are sequentially sent towards t. The latency of
friending a new intermediate node is different for each node.

Figure 10: Varying N (FacebookData (US))
Figure 11: Varying α (FacebookData (ZF))

6. CONCLUSION AND FUTURE WORK 

Observing the need for active friending in everyday life,
this paper formulates a new optimization problem, named
Acceptance Probability Maximization (APM), for making
friending recommendations in on-line social networks. We
propose an algorithm, Selective Invitation with Tree and In-Node
Aggregation (SITINA), to find the optimal solution for APM
and implement SITINA in Facebook. The user study and
experimental results show that active friending can effectively
maximize the acceptance probability of the friending target.

In our future work, we will first explore the impact of delay
between sending an invitation and acquiring the result in
active friending. This is important when the user would like
to make friends with the target within a certain time frame. In
addition, for multiple friending targets, it is not efficient to
configure recommendations separately for each target. An
idea is to give priority to the intermediate nodes that can
approach many targets simultaneously. We will study active
friending of a group of targets in future work.

APPENDIX

We show that APM is not submodular by the following
counterexample with four users. The user a is the existing
friend of s, i.e., S = {a}, while b, c, and t are non-friend users
of s. The influence probability is labeled beside each edge.
Consider adding a new user c to two different sets of selected
users $R_S = \{t\}$ and $R_T = \{b, t\}$, where $R_S \subset R_T$. If the
submodular property holds,
$ap(t, S, R_S \cup \{c\}, MIIA(t, \theta)) - ap(t, S, R_S, MIIA(t, \theta))
\geq ap(t, S, R_T \cup \{c\}, MIIA(t, \theta)) - ap(t, S, R_T, MIIA(t, \theta))$
should hold. The acceptance probability of selecting $R_S$, i.e.,
$ap(t, S, R_S, MIIA(t, \theta))$, is 0
since there is no path from a to t. Similarly,
$ap(t, S, R_S \cup \{c\}, MIIA(t, \theta)) = 0$. The acceptance
probability of selecting $R_T$ is $1 - (1 - 0.9 \times 0.1) = 0.09$, and
adding c into $R_T$ results in acceptance probability
$ap(t, S, R_T \cup \{c\}, MIIA(t, \theta)) = 1 - (1 - 0.09)(1 - 0.9) = 0.909$.
However,
$ap(t, S, R_S \cup \{c\}, MIIA(t, \theta)) - ap(t, S, R_S, MIIA(t, \theta)) = 0
< ap(t, S, R_T \cup \{c\}, MIIA(t, \theta)) - ap(t, S, R_T, MIIA(t, \theta))
= 0.909 - 0.09 = 0.819$.
This is a counterexample, and thus the submodular property
does not hold in APM.




[Figure legend: friends of s; non-friends of s]



